Efficient graphics state management for zone rendering

ABSTRACT

The present invention provides a mechanism to track and manage graphics state with hardware state-binning logic for use with the tile-based zone rendering method of generating graphical images. Only the current values of the dynamic state variables are maintained in hardware. Dynamic state variables include, but are not limited to, state variables that are considered likely to change between primitives. The set of dynamic state variables is subdivided into subgroups. Each state subgroup is associated with a per-bin array of tracking bits. Whenever a state change is encountered, the tracking bit corresponding to the associated state subgroup is set for all bins. Prior to placing a primitive in a bin, the tracking bits associated with that bin are examined, and the current state corresponding to set tracking bits is inserted in the bin before the primitive. Then the tracking bits for that bin are cleared.

[0001] This application is a continuation of application Ser. No. 10/039,007, filed Dec. 31, 2001.

BACKGROUND

[0002] 1. Field

[0003] The present invention relates generally to graphics systems and more particularly to graphics-rendering systems.

[0004] 2. Background Information

[0005] Computer graphics systems are commonly used for displaying graphical representations of objects on a two-dimensional video display screen. Current computer graphics systems provide highly detailed representations and are used in a variety of applications. In typical computer graphics systems, an object to be represented on the display screen is broken down into graphics primitives. Primitives are basic components of a graphics display and may include points, lines, vectors and polygons, such as triangles and quadrilaterals. Typically, a hardware/software scheme is implemented to render or draw the graphics primitives that represent a view of one or more objects being represented on the display screen.

[0006] The primitives of the three-dimensional objects to be rendered are defined by a host computer in terms of primitive data. For example, when the primitive is a triangle, the host computer may define the primitive in terms of X, Y and Z coordinates of its vertices, as well as the red, green and blue (R, G and B) color values of each vertex. Additional primitive data may be used in specific applications.

[0007] Image rendering is the conversion of a high-level object-based description into a graphical image for display on some display device. For example, an act of image rendering occurs during the conversion of a mathematical model of a three-dimensional object or scene into a bitmap image. Another example of image rendering is converting an HTML document into an image for display on a computer monitor. Typically, a hardware device referred to as a graphics-rendering engine performs these graphics processing tasks. Graphics-rendering engines typically render scenes into a buffer that is subsequently output to the graphical output device, but it is possible for some rendering engines to write their two-dimensional output directly to the output device. The graphics-rendering engine interpolates the primitive data to compute the display screen pixels that represent each primitive, and the R, G and B color values of each pixel.

[0008] A graphics-rendering system (or subsystem), as used herein, refers to all of the levels of processing between an application program and a graphical output device. A graphics engine can provide for one or more modes of rendering, including zone rendering. Zone rendering attempts to increase overall 3D rendering performance by gaining optimal render cache utilization, thereby reducing pixel color and depth memory read/write bottlenecks. In zone rendering, a screen is subdivided into an array of zones, and per-zone instruction bins, used to hold all of the primitive and state-setting instructions required to render each sub-image, are generated. Whenever a primitive intersects (or possibly intersects) a zone, that primitive instruction is placed in the bin for that zone. Some primitives will intersect more than one zone, in which case the primitive instruction is replicated in the corresponding bins. This process is continued until the entire scene is sorted into the bins. Following the first pass of building a bin for each zone intersected by a primitive, a second zone-by-zone rendering pass is performed. In particular, the bins for all the zones are rendered to generate the final image.

[0009] In order to implement a tile-rendering architecture like zone rendering, the maintenance of the correct graphics-rendering state variables within each image-space zone (i.e. bin) is very important, in that it is required to subsequently render (during the rendering phase) each bin's primitives with the graphics state that existed at the time the primitive was encountered during the binning phase.

[0010] One conventional method of associating primitives with their appropriate graphics state would be to separately maintain a copy of all encountered graphics states and associate each primitive with some tag (i.e. index) identifying the specific state to be later used for rendering the primitive. However, the complexity of maintaining a separate state table and the cost (in required memory footprint, latency and bandwidth) of loading complete state sets, possibly between each primitive, can be prohibitive in low-cost and bandwidth-constrained (e.g., integrated) graphics systems.

[0011] What is needed therefore is a method, apparatus and system for graphics state management for zone rendering that is less costly and more efficient.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] FIG. 1 illustrates a block diagram of an embodiment of a computer system including an embodiment of a graphics device for automatic memory management for zone rendering.

[0013] FIG. 2 illustrates a block diagram of an embodiment of a graphics device including a graphics-binning engine for processing a scene input list including delta states, graphics-rendering engine and bins.

[0014] FIG. 3 illustrates a depiction of an embodiment of a zone renderer screen view including zones and geometrical primitives.

[0015] FIG. 4 illustrates a block diagram of an embodiment of dynamic state subgroups and per-bin tracking bits.

[0016] FIG. 5 illustrates a flow diagram of an embodiment of a process for outputting any required state changes prior to placing a primitive in a bin and of a process for using the stored current dynamic state and the per-bin tracking bit vectors to efficiently manage state changes during the scene capture phase of zone rendering.

[0017] FIG. 6 illustrates a flow diagram of an embodiment of a process for an optimization for detecting which texture maps and texture blend stages are required.

DETAILED DESCRIPTION

[0018] The present invention provides a cost-effective mechanism to track and manage graphics state with hardware state-binning logic for use with the tile-based zone rendering method of generating graphical images. In accordance with an embodiment of the present invention, only the current values of the dynamic state variables are maintained in hardware. In one embodiment, dynamic state variables include, but are not limited to, variables that are considered likely to change between primitives. State variables that remain constant or fairly constant during typical scenes are typically excluded. The set of dynamic state variables is subdivided into subgroups. Each state subgroup is associated with a per-bin array of tracking bits. Whenever a state change is encountered during the binning phase, the tracking bit corresponding to the associated state subgroup is set for all bins. Prior to placing a primitive in a bin, the tracking bits associated with that bin are examined, and the current state corresponding to set tracking bits is inserted in the bin before the primitive. The tracking bits for that bin are then cleared.

[0019] As discussed in detail below, the present invention optimizes zone rendering support in that it removes the need to track state changes in the driver software, thus increasing performance and reducing driver complexity. The cost of binning state management is minimized by supporting only a single instance of the on-chip storage of dynamic state variables, while providing a means to effect changes to any state variable, and by reducing the on-chip per-bin storage to tracking bits, typically four per bin. Additionally, the requirements for state-change bandwidth and footprint are reduced by (a) collapsing back-to-back state changes within a subgroup, (b) eliminating updates of non-required texture blend stage and texture map state data, and (c) providing optimized (and low-latency) instructions for state subgroup changes. Moreover, the cost and complexity of managing indirectly stored state arrays and/or caches are reduced.

[0020] In the detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be understood by those skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the present invention.

[0021] Some portions of the detailed description that follow are presented in terms of algorithms and symbolic representations of operations on data bits or binary signals within a computer. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to convey the substance of their work to others skilled in the art. An algorithm is here, and generally, considered to be a self-consistent sequence of steps leading to a desired result. The steps include physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers or the like. It should be understood, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the specification, discussions utilizing such terms as “processing” or “computing” or “calculating” or “determining” or the like, refer to the action and processes of a computer or computing system, or similar electronic computing device, that manipulate and transform data represented as physical (electronic) quantities within the computing system's registers and/or memories into other data similarly represented as physical quantities within the computing system's memories, registers or other such information storage, transmission or display devices.

[0022] Embodiments of the present invention may be implemented in hardware or software, or a combination of both. However, embodiments of the invention may be implemented as computer programs executing on programmable systems comprising at least one processor, a data storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Program code may be applied to input data to perform the functions described herein and generate output information. The output information may be applied to one or more output devices, in known fashion. For purposes of this application, a processing system includes any system that has a processor, such as, for example, a digital signal processor (DSP), a micro-controller, an application specific integrated circuit (ASIC), or a microprocessor.

[0023] The programs may be implemented in a high level procedural or object oriented programming language to communicate with a processing system. The programs may also be implemented in assembly or machine language, if desired. In fact, the invention is not limited in scope to any particular programming language. In any case, the language may be a compiled or interpreted language.

[0024] The programs may be stored on a storage media or device (e.g., hard disk drive, floppy disk drive, read only memory (ROM), CD-ROM device, flash memory device, digital versatile disk (DVD), or other storage device) readable by a general or special purpose programmable processing system, for configuring and operating the processing system when the storage media or device is read by the processing system to perform the procedures described herein. Embodiments of the invention may also be considered to be implemented as a machine-readable storage medium, configured for use with a processing system, where the storage medium so configured causes the processing system to operate in a specific and predefined manner to perform the functions described herein.

[0025] An example of one such type of processing system is shown in FIG. 1. Sample system 100 may be used, for example, to execute the processing for methods in accordance with the present invention, such as the embodiment described herein. Sample system 100 is representative of processing systems based on the microprocessors available from Intel Corporation, although other systems (including personal computers (PCs) having other microprocessors, engineering workstations, set-top boxes and the like) may also be used. In one embodiment, sample system 100 may be executing a version of the WINDOWS™ operating system available from Microsoft Corporation, although other operating systems and graphical user interfaces, for example, may also be used.

[0026] FIG. 1 is a block diagram of a system 100 of one embodiment of the present invention. The computer system 100 includes central processor 102, graphics and memory controller 104 including graphics device 106, memory 108 and display device 114. Processor 102 processes data signals and may be a complex instruction set computer (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor implementing a combination of instruction sets, or other processor device, such as a digital signal processor, for example. Processor 102 may be coupled to common bus 112 that transmits data signals between processor 102 and other components in the system 100. FIG. 1 is for illustrative purposes only. The present invention can also be utilized in a configuration including a discrete graphics device.

[0027] Processor 102 issues signals over common bus 112 for communicating with memory 108 or graphics and memory controller 104 in order to manipulate data as described herein. Processor 102 issues such signals in response to software instructions that it obtains from memory 108. Memory 108 may be a dynamic random access memory (DRAM) device, a static random access memory (SRAM) device, or other memory device. Memory 108 may store instructions and/or data represented by data signals that may be executed by processor 102, graphics device 106 or some other device. The instructions and/or data may comprise code for performing any and/or all of the techniques of the present invention. Memory 108 may also contain software and/or data. An optional cache memory 110 may be used to speed up memory accesses by the graphics device 106 by taking advantage of its locality of access. In some embodiments, graphics device 106 can offload from processor 102 many of the memory-intensive tasks required for rendering an image. Graphics device 106 processes data signals and may be a complex instruction set computer (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor implementing a combination of instruction sets, or other processor device, such as a digital signal processor, for example. Graphics device 106 may be coupled to common bus 112 that transmits data signals between graphics device 106 and other components in the system 100, including render cache 110 and display device 114. Graphics device 106 includes rendering hardware that, among other things, writes specific attributes (e.g. colors) to specific pixels of display device 114 and draws complicated primitives on display device 114. Graphics and memory controller 104 communicates with display device 114 for displaying images rendered or otherwise processed by graphics controller 104 to a user. Display device 114 may comprise a computer monitor, television set, flat panel display or other suitable display device.

[0028] Memory 108 stores a host operating system that may include one or more rendering programs to build the images of graphics primitives for display. System 100 includes graphics device 106, such as a graphics accelerator that uses a customized hardware logic device or a co-processor to improve the performance of rendering at least some portion of the graphics primitives otherwise handled by host rendering programs. The host operating system program and its host graphics application program interface (API) control the graphics device 106 through a driver program.

[0029] Referring to FIGS. 2 and 3, an embodiment 160 of various graphics objects, for example geometric primitives (e.g., triangles, lines) 162, implemented on a zone rendering system 120 is illustrated. In zone rendering, a screen is subdivided into an array of zones 164, commonly screen-space rectangles, although other geometric variants may be used as well. Each zone 164 is associated with a bin. Each bin 128 includes a chained series of command buffers 134 stored within non-contiguous physical memory pages. The bins 128 are thus preferably implemented as a chain of independent physical pages.

[0030] When a primitive 162 intersects a zone 164, the corresponding primitive instruction is placed in the bin 128 associated with the zone 164 intersected. Per-zone instruction bins 128 are thus used to hold primitive instructions and state-setting instructions required to render each sub-image and are generated by comparing the screen-space extent of each primitive 162 to the array of zones 164. Thus, as the primitives 162 are received, the present invention determines which zone(s) 164 each primitive 162 intersects, and replicates the primitive instructions into a bin 128 associated with each of these zones 164. The process of assigning primitives (and their attributes) 142 to zones 164 is referred to as binning. “Bin” 128 refers to the abstract buffer used for each zone, where a bin 128 will typically be realized as a series of instruction batch buffers 134. Binning performs the necessary computations to determine what primitives 162 lie in what zones 164 and can be performed by dedicated hardware and/or software implementations. In one typical implementation, a driver 122 writes out a set of commands to be parsed by the graphics-binning engine 126 for each zone 164 intersected by a primitive 162 and the commands are written into buffers 134 associated with the zones 164 intersected.
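As an illustration only, the comparison of a primitive's screen-space extent against the array of zones can be sketched in software as follows. The zone dimensions, the 4x4 zone array, the function names and the sample scene are hypothetical choices made for this sketch and are not taken from the described embodiment, which typically performs binning in hardware. The sketch uses a conservative bounding-box test, so a primitive may be placed in a zone it only possibly intersects.

```python
from collections import defaultdict

# Hypothetical parameters for illustration; real zone sizes are tuned
# to the render cache and are not specified here.
ZONE_W, ZONE_H = 32, 32   # zone size in pixels
COLS, ROWS = 4, 4         # zone array dimensions


def zones_for_extent(xmin, ymin, xmax, ymax):
    """Return the index of every zone the screen-space extent may touch."""
    c0 = max(0, int(xmin) // ZONE_W)
    c1 = min(COLS - 1, int(xmax) // ZONE_W)
    r0 = max(0, int(ymin) // ZONE_H)
    r1 = min(ROWS - 1, int(ymax) // ZONE_H)
    return [r * COLS + c for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


def bin_primitives(primitives):
    """Replicate each primitive into the bin of every zone it may intersect."""
    bins = defaultdict(list)
    for name, verts in primitives:
        xs = [x for x, _ in verts]
        ys = [y for _, y in verts]
        for zone in zones_for_extent(min(xs), min(ys), max(xs), max(ys)):
            bins[zone].append(name)
    return bins


scene = [
    ("tri_a", [(5, 5), (20, 10), (10, 25)]),     # fits inside one zone
    ("tri_b", [(30, 30), (70, 35), (40, 60)]),   # spans several zones
]
binned = bin_primitives(scene)
```

In this sketch the second triangle is replicated into every zone covered by its bounding box, which mirrors the replication of primitive instructions across bins described above.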

[0031] Some primitives 162 will intersect more than one zone 164, in which case the primitive instruction is replicated in bins 128 corresponding to the intersected zones 164. For example, the lightning bolt depicted in FIG. 3 intersects nine zones 164. This process is continued until the entire scene is sorted into bins 128.

[0032] Once all the primitives 162 are sorted and the command structures completed, a second pass is made to render the scene one zone 164 at a time. Following the first pass of building a bin for each zone 164 intersected by a primitive 162, a second zone-by-zone rendering pass is performed. In particular, the bins 128 for all the zones 164 are rendered to generate the final image, with each scene rendered one zone 164 at a time. The order with which the zones 164 are rendered is not significant. All bins 128 associated with primitives 162 that touch pixels within a particular zone 164 are rendered before the next zone 164 is rendered. A single primitive 162 may intersect many zones 164, thus requiring multiple replications. As a result, primitives 162 that intersect multiple zones 164 are rendered multiple times (i.e. once for each zone 164 intersected).

[0033] Rendering performance improves as a result of the primitives 162 being decomposed into zones 164 that are aligned to the render cache 110. Since the graphics device 106 is only working on a small portion of the screen at a time (i.e. a zone 164), it is able to hold the frame buffer contents for the entire zone 164 in a render cache 110. The dimensions of the zone 164 are typically a constant tuned to the size and organization of the render cache 110. It is by this mechanism that the render cache 110 provides optimal benefits: reuse of cached data is maximized by exploiting the spatial coherence of a zone 164. Through use of the zone rendering mode, only the minimum number of color memory writes need be performed to generate the final image one zone 164 at a time, and color memory reads and depth memory reads and writes can be minimized or avoided altogether. Use of the render cache 110 thus significantly reduces the memory traffic and improves performance relative to a conventional renderer that draws each primitive completely before continuing to the next primitive.

[0034] Referring to FIG. 2, in a typical implementation, a graphics primitive and state-setting instruction stream, referred to as a scene input list 124, is initially applied to graphics-binning engine ring buffer 125 associated with graphics-binning engine 126. The scene input list 124 may be a single, temporally-ordered scene description (as received by the application programming interface). Graphics-binning engine 126 is typically implemented as a hardware binning engine (HWB) 126. One skilled in the art will recognize that a software or software plus hardware binner could be used as well. The graphics-binning engine 126 parses scene input list 124 and determines which zone(s) 164 each primitive 162 intersects.

[0035] As previously noted, the zones 164 are associated with bins 128. Graphics-binning engine 126 compares the screen-space extent of each primitive 162 to the array of zones 164, and replicates the associated primitive commands into corresponding bins 128. As shown in FIG. 5 and described in detail below, bins 128 are comprised of chained series of command buffers 134 typically stored within non-contiguous physical memory pages. A bin list 132 is a list of buffers 134 which comprise each bin 128. Pages are initially allocated to the BMP 140. The bin pointer list 130 is initialized with the page numbers of the pages and stores a write pointer into the bin list 132.

[0036] The graphics-binning engine 126 also maintains the current graphics state by parsing associated state-setting instructions contained within the scene input list 124. Prior to placing a primitive command in any given bin 128, the graphics-binning engine 126 typically precedes the primitive command in the bin 128 with any required state-setting instructions.

[0037] After the scene input list 124 has been completely parsed, the associated bins (i.e. bin 0, bin 1 . . . bin n−1) are ready to be used by the graphics-rendering engine 136 to render the scene. As discussed in detail below, instructions are included at the end of the scene input list 124 to cause the graphics-binning engine 126 to increment the register in pending scene counter 148 by one and initiate rendering of the binned scene. For example, graphics-binning engine 126 sends a render instruction to graphics-rendering engine ring buffer 157 associated with graphics-rendering engine 136 via path 156.

[0038] FIG. 4 illustrates a block diagram of an embodiment 170 of current dynamic state subgroups 172 and per-bin tracking bits 174. The subdivision of dynamic states into subgroups 172 provides a level of granularity for tracking and effecting changes to the dynamic state variables. In a typical embodiment, the set of dynamic state variables is subdivided into four subgroups 172. One skilled in the art will recognize that four subgroups 172 are advantageous for the configuration shown and discussed herein; however, the present invention can be utilized with any number of subgroups 172 configured any number of ways.

[0039] Associated with each bin 128 (e.g. bin 0, bin 1, bin 2 . . . bin n) is a plurality of tracking bits 174, with each tracking bit associated with a particular dynamic state subgroup 172. When a tracking bit 174 for a particular bin 128 is “set,” it is an indication that some state variable within that subgroup 172 (and for that particular bin 128, e.g., bin n) has changed since the time a primitive 162 was last output to that bin 128. Conversely, a “cleared” tracking bit 174 indicates that the associated dynamic state subgroup 172 for that bin 128 has not changed since the time a primitive 162 was last output to that bin 128. In a typical implementation, a per-bin, 4-bit “tracking bit” vector 174 is used to track changes to the four state subgroups 172. The tracking bit arrays for the “texture map”, “texture blend”, “basic state” and “slow state” subgroups each typically span 512 or 1024 bins. With 512 bins, this amounts to 2K bits total (4-bit vector per bin * 512 bins). With 1024 bins, this amounts to 4K bits total (4-bit vector per bin * 1024 bins).
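The storage figures quoted above follow directly from multiplying the number of bins by the number of subgroups. A minimal check of that arithmetic is shown below; the function and constant names are hypothetical and are used only for this illustration.

```python
# One tracking bit per (bin, subgroup) pair; names are illustrative only.
SUBGROUPS = ("texture_map", "texture_blend", "basic_state", "slow_state")


def tracking_storage_bits(num_bins, num_subgroups=len(SUBGROUPS)):
    """Total on-chip tracking-bit storage, in bits."""
    return num_bins * num_subgroups


storage_512 = tracking_storage_bits(512)     # 4 bits * 512 bins
storage_1024 = tracking_storage_bits(1024)   # 4 bits * 1024 bins
```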

[0040] Initially, all tracking bits 174 are “set” in order to initialize each bin 128 with a complete complement of dynamic state. In particular, prior to placing a primitive 162 in a bin 128, the tracking bits 174 associated with that bin 128 are examined, and the current state corresponding to set tracking bits 174 is inserted in the bin 128 before the primitive 162. Then the tracking bits 174 for that bin 128 are cleared. In a typical embodiment, many of the state bits 174 do not change very often. However, any of the bits 174 that have changed over time must be identified and issued, or a larger group of them issued, to the bin 128 such that the precise state for the triangle can be maintained during rasterization.
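The set, examine, output and clear sequence described above can be modeled in software as a minimal sketch. The class name, method names, subgroup labels and the use of plain integers as state values are assumptions made for illustration; the described embodiment implements this logic in the hardware binning engine.

```python
# Software model of per-bin tracking bits; all names are hypothetical.
SUBGROUPS = ("basic", "texture_map", "texture_blend", "slow")
ALL_SET = (1 << len(SUBGROUPS)) - 1   # 4-bit vector with every bit set


class StateBinner:
    def __init__(self, num_bins):
        # Only the current value of each dynamic state subgroup is stored.
        self.current = dict.fromkeys(SUBGROUPS, 0)
        # All tracking bits start set so each bin receives a full state set.
        self.tracking = [ALL_SET] * num_bins
        self.bins = [[] for _ in range(num_bins)]

    def state_change(self, subgroup, value):
        """Record the new state and set that subgroup's bit for all bins."""
        self.current[subgroup] = value
        bit = 1 << SUBGROUPS.index(subgroup)
        self.tracking = [t | bit for t in self.tracking]

    def place_primitive(self, prim, bin_ids):
        """Emit pending state ahead of the primitive, then clear the bits."""
        for b in bin_ids:
            for i, name in enumerate(SUBGROUPS):
                if self.tracking[b] & (1 << i):
                    self.bins[b].append(("state", name, self.current[name]))
            self.tracking[b] = 0
            self.bins[b].append(("prim", prim))


binner = StateBinner(num_bins=2)
binner.place_primitive("p0", [0])      # bin 0 gets all four subgroups first
binner.state_change("basic", 1)
binner.state_change("basic", 2)        # back-to-back changes collapse
binner.place_primitive("p1", [0, 1])   # bin 0 gets only the basic subgroup
```

Because only the current value is stored, the two back-to-back changes to the same subgroup result in a single state entry in each bin, and a bin that never receives a primitive never receives state.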

[0041] In particular, the state groups 172 shown in FIG. 4 include:

[0042] (1) “Basic State” Subgroup 176: The state variables associated with the Basic State tracking bit array include, but are not limited to, vertex-buffer, vertex-format, setup, texel stream and pixel pipeline state variables. The state variables are typically arranged into a fixed sequence of words. In one particular embodiment, any change to a basic state variable will require all the words to be issued to the required bins 128, although one skilled in the art will recognize that other configurations may be used as well. In another embodiment, more granular tracking of basic state changes for either all bins 128 or a smaller subset of “open” bins 128 is implemented.

[0043] (2) “Texture Map” Subgroup 178: The state variables associated with the Texture Map State tracking bit array include, but are not limited to, most texture map parameters, cube map face enables and texture filter parameters. State variables are output by the graphics-binning engine 126 as part of an instruction. When the Texture Map State tracking bit for a particular bin 128 is found set, the graphics-binning engine 126 will only output the words associated with texture maps that are currently required by the current context settings. The “currently used” maps are determined by the graphics-binning engine 126 by examining the enabled texture blend stages, seeing which texel streams are required as input, and then examining which texture maps are associated with those required texel streams.
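The three-step determination of the “currently used” maps (enabled blend stages, then required texel streams, then associated texture maps) can be sketched as follows. The dictionary layout for blend stages and the stream-to-map table are hypothetical data structures chosen for this illustration; the actual state encoding is not given here.

```python
def required_texture_maps(blend_stages, stream_to_map):
    """Return the texture maps needed by the enabled blend stages.

    blend_stages: list of dicts with hypothetical keys "enabled" and
                  "texel_streams" (the streams a stage reads as input).
    stream_to_map: hypothetical table mapping texel stream -> texture map.
    """
    streams = set()
    for stage in blend_stages:
        if stage["enabled"]:
            streams.update(stage["texel_streams"])
    return sorted({stream_to_map[s] for s in streams})


stages = [
    {"enabled": True, "texel_streams": [0]},
    {"enabled": True, "texel_streams": [0, 2]},
    {"enabled": False, "texel_streams": [1]},   # disabled: ignored
    {"enabled": False, "texel_streams": [3]},   # disabled: ignored
]
stream_to_map = {0: 0, 1: 1, 2: 3, 3: 2}
needed = required_texture_maps(stages, stream_to_map)
```

Only the state words for the maps in `needed` would have to be output to a bin whose Texture Map tracking bit is set; maps reachable only through disabled stages are skipped.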

[0044] (3) “Texture Blend” Subgroup 180: The state variables associated with the Texture Blend State tracking bit array include, but are not limited to, state variables used to control texture map blend stages. In a typical implementation, the state variables include global control and texture blend color, alpha and control stage parameters for controlling one to four texture map blend stage units. These state variables are output by the graphics-binning engine 126 as part of an instruction. When the Texture Blend State tracking bit for a particular bin 128 is found set, the graphics-binning engine 126 will only output those words required by the current setting of the “number of enabled texture blend stages” derived state variable.

[0045] (4) “Slow State” Subgroup 182: The state variable associated with the Slow State tracking bit array is a “Slow State Pointer.” The Slow State pointer indirectly controls any state pointer not included in the other state subgroups such as the Basic, Texture Blend and Texture Map state subgroups 176, 178 and 180. In a typical embodiment, low-level state changes and/or infrequently changing instructions are placed in the slow state buffers 166 and the pointers are only passed into those slow state buffers 166 via an instruction 167.

[0046] The “slow state” group contains the remaining non-pipelined (and a few low-frequency pipelined) state variables. Changes to “slow state” state variables are not directly sent to the graphics-binning engine 126. Rather, as shown in FIG. 2, slow state buffers 166 containing initial plus delta state changes for these variables are built such that a single pointer into one of these slow state buffers 166 is sufficient to define the current state of all the “slow” state variables. This way, only the slow state pointer needs to be sent to the graphics-binning engine 126. Changes to the slow state pointer will be placed in bins 128 as required, and during rendering will initiate the required reads of the slow state buffers 166 to update slow state variables. The graphics-rendering engine 136 will execute the buffered instructions between the previous and new values of the slow state “pointers,” though only if the new value falls between the previous value and the end of the page being rendered. Otherwise the graphics-rendering engine 136 will execute the instructions from the top of the page specified by the new slow state pointer up to but not including the Dword specified by the new slow state pointer. This will thereby set all slow state variables to their initial values and then apply all delta state changes up to but not including the Dword specified by the new slow state pointer.
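The two replay cases for a slow state pointer change can be sketched as below. This is a simplified model under stated assumptions: pointers are treated as absolute Dword indices, each slow state buffer is assumed to occupy exactly one fixed-size page, the page size of 8 Dwords is arbitrary, and a new pointer equal to the page end is treated as belonging to the next page. None of these details are specified in the text.

```python
PAGE_DWORDS = 8  # hypothetical page size, in Dwords


def dwords_to_execute(prev_ptr, new_ptr, page_dwords=PAGE_DWORDS):
    """Return the Dword indices replayed when the slow state pointer moves.

    Incremental case: the new pointer lies between the previous pointer and
    the end of the previous pointer's page, so only the deltas in between
    are executed. Otherwise the page holding the new pointer is replayed
    from its top (initial state, then deltas), excluding the new Dword.
    """
    page_end = (prev_ptr // page_dwords + 1) * page_dwords
    if prev_ptr <= new_ptr < page_end:
        return range(prev_ptr, new_ptr)
    new_page_start = (new_ptr // page_dwords) * page_dwords
    return range(new_page_start, new_ptr)


forward = list(dwords_to_execute(2, 5))      # same page, moving forward
backward = list(dwords_to_execute(5, 2))     # same page, moving backward
other_page = list(dwords_to_execute(5, 11))  # pointer moved to another page
```

In the backward and cross-page cases the replay restarts at the top of the page, which corresponds to resetting the slow state variables to their initial values before applying the deltas.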

[0047] In particular, the graphics primitive and state-setting instruction stream, referred to as a scene input list 124, is initially applied to graphics-binning engine ring buffer 125 associated with graphics-binning engine 126. The graphics-binning engine 126 parses scene input list 124 and determines which zone(s) 164 each primitive 162 intersects.

[0048] The graphics-binning engine 126 maintains the current graphics state by parsing associated state-setting instructions contained within the scene input list 124. Prior to placing a primitive command in any given bin 128, the graphics-binning engine 126 precedes the primitive command in the bin 128 with the state-setting instructions 167.

[0049] If the Slow State tracking bit 183 for bin 0 indicates a change, the slow state “pointer” is output to the bin 128. The Slow State tracking bit 183 is then cleared. If the Slow State tracking bit 183 does not indicate a change, the slow state pointer is not output to bin 0.

[0050] Slow state buffers 166 store initial and delta state changes for the slow state variables. A single pointer into one of these slow state buffers 166 is sufficient to define the current state of all the “slow” state variables. The slow state buffers 166 may be located in either state memory 108 or a dedicated memory. In a typical embodiment, low-level state changes and/or infrequently changing instructions are placed in the slow state buffers 166 and binner pointers are only passed into those slow state buffers via an instruction 167. Once all the primitives and state instructions including slow state pointers are binned, a second pass is made to render the scene one zone 164 at a time. The bins 128 are rendered to generate the final image, with each scene rendered one zone 164 at a time. During rendering, the graphics-rendering engine 136 initiates the required reads of the slow state buffers 166, based upon the binned slow state pointer, to update slow state variables. The graphics-rendering engine 136 will execute the buffered instructions between the previous and new values of the slow state “pointers.”

[0051] FIG. 5 is a flow diagram illustrating an embodiment 190 for outputting any required state changes prior to placing a primitive instruction associated with a primitive 162 in a bin 128 (steps 192-208) and for examining delta state changes and updating state and per-bin tracking bits 174 (steps 212-222). These optimizations minimize unnecessary state replication.

[0052] In particular, prior to placing a primitive instruction in a particular bin 128, the graphics-binning engine 126 ensures that the state for the particular bin 128 is current at least to the point that the primitive 162 can be rendered correctly during the rendering phase. This means that a state that is not currently used does not have to be output prior to outputting the primitive instruction into the bin 128. For example, disabled texture blend state settings and unused texture map settings do not have to be output to the bin 128.

[0053] If a primitive 162 is encountered (step 192), the present invention determines which zones 164 the primitive 162 intersects (step 194). For each bin 128 associated with the zone 164 intersected (step 196), each subgroup tracking bit 174 is examined (step 198). If the tracking bit 174 for a particular subgroup is set (step 199), the current values of the particular subgroup 172 are output to the bin 128 (step 200). A tracking bit 174 is considered “set” when a state subgroup 172 associated with the tracking bit 174 becomes “used.”

[0054] For example, the subgroup tracking bit 177 for the Basic State Subgroup 176 may initially be examined for bin 0. After the current values are output, the subgroup tracking bit 174 for that particular bin 128 is cleared (step 202). If the tracking bit for a particular subgroup is not set, no current values of the particular subgroup 172 are output to the bin 128 (step 201).

[0055] The next subgroup tracking bit 174 for the bin 128 is then examined and steps 198, 200 and 202 are repeated for each bit 174 in the bin 128 (step 204). For example, the tracking bit 179 for the Texture Map Subgroup 178 for bin 0 may be examined next.

[0056] The primitive instructions are output to the bin 128 after all the tracking bits 174 associated with the subgroups 172 have been examined for the particular bin 128 (step 206). The tracking bits 174 of the subgroups 172 for the next bin 128 associated with a zone 164 that is intersected by the primitive 162 are then examined (step 208).
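Steps 192-208 can be sketched in software as follows. This is an illustrative model only: the description refers to hardware state-binning logic, and the class name, the subgroup labels, the use of Python lists for bins and the tuple record format are all assumptions made for the example.

```python
SUBGROUPS = ("basic", "texture_map", "texture_blend", "slow")  # illustrative labels


class Binner:
    def __init__(self, num_bins):
        # Only the current value of each subgroup is kept.
        self.current = {g: None for g in SUBGROUPS}
        # One tracking bit per bin and per subgroup.
        self.dirty = [{g: False for g in SUBGROUPS} for _ in range(num_bins)]
        self.bins = [[] for _ in range(num_bins)]

    def place_primitive(self, prim, intersected_bins):
        for b in intersected_bins:                       # steps 196 and 208
            for g in SUBGROUPS:                          # steps 198 and 204
                if self.dirty[b][g]:                     # step 199
                    record = ("state", g, self.current[g])
                    self.bins[b].append(record)          # step 200
                    self.dirty[b][g] = False             # step 202
            self.bins[b].append(("prim", prim))          # step 206
```

A bin that the primitive does not intersect is left untouched, including its tracking bits, so it still receives the pending state the next time a primitive lands in it.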

[0057] For example, referring to FIG. 4, if a Basic State tracking bit 177 associated with bin 0 indicates a change, the current values of the Basic State Subgroup 176 are output to bin 0. The Basic State tracking bit 177 is then cleared. If the Basic State tracking bit 177 does not indicate a change, no values are output to bin 0.

[0058] The tracking bit 174 for the next subgroup 172 is then examined. For example, if the Texture Map State tracking bit 179 for bin 0 indicates a change, the current values of the Texture Map State 178 are output. The Texture Map State tracking bit 179 is then cleared. If the Texture Map State tracking bit 179 does not indicate a change, no values are output to bin 0.

[0059] Similarly, if the Texture Blend State tracking bit 181 for bin 0 indicates a change, the current values of the Texture Blend State 180 are output. The Texture Blend State tracking bit 181 is then cleared. If the Texture Blend State tracking bit 181 does not indicate a change, no values are output to bin 0.

[0060] If the Slow State tracking bit 183 for bin 0 indicates a change, the slow state pointer is output to bin 0. The Slow State tracking bit 183 is then cleared. If the Slow State tracking bit 183 does not indicate a change, the slow state pointer is not output to bin 0.

[0061] The tracking bit 174 thus eliminates the possibility that a stale resource will be used. For example, suppose that (a) a change is made to a currently unused texture map, (b) a primitive 162 is drawn (clearing the respective texture map tracking bit 179 without outputting the unused state), and (c) a state change is made such that the texture map is now used “as is” (i.e., without a change to the map itself). Without any special handling, the stale texture map would be incorrectly used. In the present invention, the texture map tracking bit 179 is set when a texture map becomes “used” (as a result of a state change to the texture map state), thus eliminating the possibility that a stale resource will be used.
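The three-step scenario can be reproduced with a small model of a single bin. Everything here is hypothetical scaffolding for the example: the class, its method names and the `mark_on_use` switch (which exists only to show what would go wrong without the rule) are not taken from the description.

```python
class TextureMapTracker:
    """One bin's texture map tracking bit (illustrative model)."""

    def __init__(self):
        self.maps = {}      # current map contents, by map index
        self.used = set()   # indices of maps that are currently used
        self.dirty = False  # texture map tracking bit for this bin
        self.bin = []

    def change_map(self, index, texture):
        self.maps[index] = texture
        self.dirty = True

    def set_used(self, indices, mark_on_use=True):
        newly_used = set(indices) - self.used
        self.used = set(indices)
        if newly_used and mark_on_use:
            self.dirty = True  # a map becoming "used" sets the bit

    def draw(self, prim):
        if self.dirty:
            for i in sorted(self.used):  # unused maps are not output
                self.bin.append(("map", i, self.maps.get(i)))
            self.dirty = False
        self.bin.append(("prim", prim))
```

With `mark_on_use=True`, the sequence `change_map(1, ...)`, `draw(...)`, `set_used([1])`, `draw(...)` places map 1 in the bin ahead of the second primitive. With `mark_on_use=False` the second primitive is binned with no map record, which is the stale-resource case.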

[0062] Optimized versions of state-setting graphics instructions are used to independently update each of the dynamic state variable subgroups 172. By including these optimized instructions within the bins 128, the complexity and latency of reading indirect state data are greatly reduced, thus increasing performance while lowering cost. In particular, FIG. 5 illustrates examining delta state changes and updating state and per-bin tracking bits 174 (steps 212-222). The graphics-binning engine 126 prevents unnecessary broadcasting of all state changes to all bins 128 by maintaining and tracking changes to states on a per-bin 128 and per-subgroup 172 basis. The state for a bin 128 is updated advantageously just prior to the primitive instruction being placed in the bin 128. Multiple changes occurring within the same state subgroup 172 between primitive instructions being binned are collapsed into one subgroup change output to the bin 128.
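The saving from collapsing changes can be illustrated by counting state records under two policies. This is a toy comparison under simplifying assumptions (a single subgroup, every state event treated as a real change), and the event format and function name are invented for the example.

```python
def count_state_records(events, num_bins):
    """Return (broadcast, tracked) counts of state records written to bins.

    events is a list of ("state", value) or ("prim", [bin indices]).
    broadcast: every state change is copied into every bin immediately.
    tracked:   a change only sets tracking bits; state is written to a bin
               when a primitive is placed there and the bit is set.
    """
    broadcast = 0
    tracked = 0
    dirty = [False] * num_bins
    for kind, payload in events:
        if kind == "state":
            broadcast += num_bins
            dirty = [True] * num_bins
        else:
            for b in payload:
                if dirty[b]:
                    tracked += 1  # several changes collapse into one output
                    dirty[b] = False
    return broadcast, tracked
```

For three consecutive changes followed by two primitives in bin 0 of a four-bin scene, broadcasting writes twelve records while tracking writes one.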

[0063] Steps 212-222 use the stored current dynamic state and the per-bin tracking bits 174 to efficiently manage state changes during the scene capture phase. By tracking state changes on a per-bin 128 and per-state-group 172 basis, the graphics-binning engine 126 updates only those state subgroups 172 that have changed for a particular intersected bin 128 since the last time a primitive 162 was placed in that bin 128.

[0064] Initially, if a state change is encountered (step 212), the corresponding dynamic state subgroup 172 is determined (step 214). The previous value of the particular state variable is then determined (step 216).

[0065] If the state's new value differs from the state's current value (step 218), the corresponding subgroup tracking bit 174 for each bin 128 is set (step 220). The current state is then updated with the state's new value (step 222).
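Steps 212-222 can be sketched as a single function. The data layout (a dictionary of current values, a list of per-bin dictionaries of tracking bits, and a lookup table from state variable to subgroup) and all names are assumptions for illustration; the bit is set for every bin, as the abstract describes.

```python
def apply_state_change(current, dirty, subgroup_of, name, new_value):
    """Handle one state-setting instruction; return True if it was a real change."""
    group = subgroup_of[name]        # step 214: find the subgroup
    previous = current.get(name)     # step 216: look up the previous value
    if new_value == previous:        # step 218: redundant change, nothing to do
        return False
    for bits in dirty:               # step 220: set the bit for every bin
        bits[group] = True
    current[name] = new_value        # step 222: update the current state
    return True
```

Comparing against the stored value means that re-sending an unchanged setting costs nothing at binning time, since no tracking bit is set and nothing is later written to any bin.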

[0066] Slow State Pointer

[0067] If a state change modifies a “slow state pointer,” the Slow State tracking bit (i) for each bin (i) is set. The state changes are applied to the current state for each of the subgroups affected.

[0068] Basic State

[0069] For example, if a state change modifies a “basic state,” the Basic State tracking bit (i) for each bin (i) is set.

[0070] Texture Map

[0071] If a state change causes any of the texture maps required (0 . . . 3) to become set, the Texture Map State tracking bit (i) for each bin (i) is set. If a state change modifies the texture maps (0 . . . 3), the Texture Map State tracking bit (i) for each bin (i) is set.

[0072] Texture Blend

[0073] If a state change modifies a required “color factor,” the Texture Blend State tracking bit (i) for each bin (i) is set. If a state change increases the number of enabled texture blend stages, the Texture Blend State tracking bit (i) for each bin (i) is set.
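The four rules above can be gathered into one decision function. The dictionary describing a state change, its keys and the `kind` labels are hypothetical; only the conditions themselves come from the preceding paragraphs.

```python
def subgroups_to_mark(change):
    """Return the subgroups whose tracking bits are set for every bin."""
    kind = change["kind"]
    if kind == "slow_pointer":
        return {"slow"}
    if kind == "basic":
        return {"basic"}
    if kind in ("texture_map_modified", "texture_map_enabled"):
        # A map is modified, or a required map newly becomes set.
        return {"texture_map"}
    if kind == "color_factor":
        # Only a color factor that is actually required matters.
        return {"texture_blend"} if change.get("required", False) else set()
    if kind == "blend_stage_count":
        # Only an increase in enabled stages exposes new state.
        return {"texture_blend"} if change["new"] > change["old"] else set()
    return set()
```

Under these rules, reducing the number of enabled blend stages sets no bit, because no stage that was previously unused becomes used.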

[0074] FIG. 6 is a flow diagram illustrating an embodiment 230 of an optimization for specifically detecting which, if any, states associated with a particular texture map or texture blend stage need to be output to the bin 128. In particular, state information associated with a particular texture map or texture blend stage is only output to the bin 128 when subgroup tracking bits 174 associated with texture map or texture blend states are found set. Only information associated with the used texture map or texture blend stage is output to the bin 128. This prevents stale texture blend or texture map states from being used by ensuring that only the newly required state or map is output to intersecting bins. The subgroup tracking bits 174 are set when (a) a change is made to the state associated with a currently used map/stage, or (b) a previously unused map/stage becomes “used” via a change to some other state variable. Regarding the latter, for example, a texture blend or texture map becomes “used” as a result of an associated state change (e.g., a basic state change).

[0075] For each bin 128 the primitive 162 intersects (step 232), the texture map and/or blend subgroup tracking bit 174 associated with that bin 128 is examined (step 234). If the tracking bit 174 is set (step 236), the maps/stages are examined (steps 238-244). If there is a change (step 240) in a texture map/stage, the texture map/stage is output to the bin 128 (step 242). The next texture map/stage is then examined (step 244). For example, if the Texture Blend State tracking bit (i) is set (step 236), then the particular texture blend stage(s) that have changed are output to the bin 128. In particular, if the texture blend stage 0 changed, the texture blend stage 1 is output. If the texture blend stage 1 changed, the texture blend stage 2 is output. If the texture blend stage 2 has reached its last stage, the texture blend stage 3 is output and so forth. The Texture Blend State tracking bit is then cleared (step 246).

[0076] If the Texture Map State tracking bit (i) is set (step 232), then the particular texture map(s) that have changed are output to the bin 128. In particular, if the texture map [0] is changed, the texture map 0 is output. If the texture map [1] is changed, the texture map 1 is output. If the texture map [2] is changed, the texture map 2 is output. If the texture map [3] is changed, the texture map 3 is output and so forth. The Texture Map State tracking bit is then cleared (step 246).

[0077] If a tracking bit 174 does not indicate a change, the primitive 162 is drawn, clearing the respective texture blend stage or texture map/stage tracking bit without outputting the non-required state to the bin 128.
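The per-map pass of FIG. 6 can be sketched for one bin as follows. The description does not say how a change to an individual map or stage is recorded, so the per-map `changed` flags used here are an assumption, as are the function name and the record format. The same shape would apply to texture blend stages.

```python
def flush_texture_maps(bin_out, subgroup_bit, changed, used, current):
    """Output changed, used texture maps to one bin; return the cleared bit.

    changed: list of hypothetical per-map change flags for this bin
    used:    set of map indices that are currently used
    current: list of current map values
    """
    if subgroup_bit:                                     # step 236
        for i in range(len(current)):                    # steps 238 and 244
            if changed[i] and i in used:                 # step 240
                bin_out.append(("map", i, current[i]))   # step 242
                changed[i] = False
    # Step 246: the subgroup bit is cleared whether or not anything was output.
    return False
```

A changed map that is not used keeps its flag, so it is still output later if a subsequent state change makes it used and sets the subgroup bit again.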

[0078] Having now described the invention in accordance with the requirements of the patent statutes, those skilled in the art will understand how to make changes and modifications to the present invention to meet their specific requirements or conditions. Such changes and modifications may be made without departing from the scope and spirit of the invention as set forth in the following claims.

What is claimed is:
 1. A method for managing state variables for rendering primitives, comprising: defining a plurality of memory areas for storing instructions and current state information associated with the primitives; defining state variables associated with the primitives; sorting state variables into a plurality of subgroups; associating each of the plurality of memory areas with a plurality of tracking bits, wherein each tracking bit is associated with a subgroup and one of the plurality of memory areas; setting the tracking bit in response to a change in a state variable since the primitive instructions were previously output to the memory areas; and outputting current state information associated with the subgroup into one of the plurality of memory areas in response to the tracking bit being set.
 2. The method of claim 1 further comprising: clearing the tracking bit after the current state information has been output to the one of the plurality of memory areas.
 3. The method of claim 1 wherein sorting state variables into a plurality of subgroups further comprises: sorting state variables associated with basic state functions.
 4. The method of claim 1 wherein sorting state variables into a plurality of subgroups further comprises: sorting state variables associated with texture map functions.
 5. The method of claim 1 wherein sorting state variables into a plurality of subgroups further comprises: sorting state variables associated with texture blend functions.
 6. The method of claim 1 wherein sorting state variables into a plurality of subgroups further comprises: sorting state variables associated with slow state functions.
 7. The method of claim 1 wherein setting the tracking bit in response to a change in a state variable since the primitive instructions were previously output to the memory areas further comprises: determining whether a state change has occurred; determining which subgroup is associated with the state change; determining a new and previous value of the state variable; and setting the corresponding subgroup tracking bit if the new and previous values differ.
 8. The method of claim 1 further comprising: defining a plurality of additional memory areas for storing state information; defining a reference for the plurality of additional memory areas; and storing selected state group information into the plurality of additional memory areas.
 9. The method of claim 8 wherein outputting current state information associated with the subgroup into one of the plurality of memory areas in response to the subgroup being set further comprises: outputting the reference associated with the subgroup into one of the plurality of memory areas when the tracking bit is set; and based on the reference, retrieving referenced information stored in one of the plurality of additional memory areas during rendering.
 10. The method of claim 9 wherein storing selected state group information into the additional memory areas further comprises: storing state variables associated with slow state functions into the plurality of additional memory areas.
 11. The method of claim 10 wherein defining a reference for the additional memory areas further comprises: defining a pointer to the additional memory areas where slow state information is stored.
 12. An apparatus for rendering a scene including primitives, comprising: a plurality of binning memory areas associated with regions that are intersected by primitives; a tracking memory area for storing indicators of state variables that have been affected since primitive instructions were previously output to the memory areas; a binning engine, responsive to the tracking indicators, for binning current state information of state variables; and a rendering engine for rendering the current information stored in the memory areas.
 13. The apparatus of claim 12 wherein the current information comprises state variables that have changed since primitive instructions were previously output to the memory areas.
 14. The apparatus of claim 12 wherein the current information comprises a reference.
 15. The apparatus of claim 13 wherein the state variables are associated with at least one state group.
 16. The apparatus of claim 15 wherein the at least one state group comprises basic state functions.
 17. The apparatus of claim 15 wherein the at least one state group comprises texture map functions.
 18. The apparatus of claim 15 wherein the at least one state group comprises texture blend functions.
 19. The apparatus of claim 15 wherein the at least one state group comprises slow state functions.
 20. The apparatus of claim 12 wherein the tracking information comprises a plurality of tracking bits wherein each tracking bit is associated with a binning memory area and a subgroup.
 21. The apparatus of claim 12 wherein a tracking memory area for storing indicators of state variables that have been affected since primitive instructions were previously output to the memory areas further comprises: a tracking memory area for storing indicators of state variables that changed since primitive instructions were previously output to the memory areas.
 22. The apparatus of claim 12 wherein a tracking memory area for storing indicators of state variables that have been affected since primitive instructions were previously output to the memory areas further comprises: a tracking memory area for storing indicators of state variables that were used since primitive instructions were previously output to the memory areas.
 23. The apparatus of claim 14 further comprising: a plurality of additional memory areas.
 24. The apparatus of claim 23 wherein the plurality of additional memory areas stores slow state variables.
 25. The apparatus of claim 24 wherein the reference comprises a pointer into the additional memory areas.
 26. The apparatus of claim 25 wherein the binning engine, responsive to the tracking indicators, bins the reference in one of the plurality of binning memory areas.
 27. The apparatus of claim 26 wherein the rendering engine utilizes the reference to retrieve slow state information from at least one of the plurality of additional memory areas.
 28. A machine readable medium having stored therein a plurality of machine readable instructions executable by a processor to manage states for rendering primitives, the machine readable instructions comprising: instructions to define a plurality of memory areas for storing instructions and current state information associated with the primitives; instructions to define state variables associated with the primitives; instructions to sort state variables into a plurality of subgroups; instructions to associate each of the plurality of memory areas with a plurality of tracking bits, wherein each tracking bit is associated with a subgroup and one of the plurality of memory areas; instructions to set the tracking bit in response to a change in a state variable since the primitive instructions were previously output to the memory areas; and instructions to output current state information associated with the subgroup into one of the plurality of memory areas in response to the tracking bit being set.
 29. The machine readable medium of claim 28 further comprising: instructions to clear the tracking bit after the current state information has been output to the one of the plurality of memory areas.
 30. The machine readable medium of claim 28 wherein instructions to sort state variables into a plurality of subgroups further comprises: instructions to sort state variables associated with basic state functions.
 31. The machine readable medium of claim 28 wherein instructions to sort state variables into a plurality of subgroups further comprises: instructions to sort state variables associated with texture map functions.
 32. The machine readable medium of claim 28 wherein instructions to sort state variables into a plurality of subgroups further comprises: instructions to sort state variables associated with texture blend functions.
 33. The machine readable medium of claim 28 wherein instructions to sort state variables into a plurality of subgroups further comprises: instructions to sort state variables associated with slow state functions.
 34. The machine readable medium of claim 28 wherein instructions to set the tracking bit in response to a change in a state variable since the primitive instructions were previously output to the memory areas further comprises: instructions to determine whether a state change has occurred; instructions to determine which subgroup is associated with the state change; instructions to determine a new and previous value of the state variable; and instructions to set the corresponding subgroup tracking bit if the new and previous values differ.
 35. The machine readable medium of claim 28 further comprising: instructions to define a plurality of additional memory areas for storing state information; instructions to define a reference for the plurality of additional memory areas; and instructions to store selected state group information into the plurality of additional memory areas.
 36. The machine readable medium of claim 35 wherein instructions to output current state information associated with the subgroup into one of the plurality of memory areas in response to the subgroup being set further comprises: instructions to output the reference associated with the subgroup into one of the plurality of memory areas when the tracking bit is set; and instructions, based on the reference, to retrieve referenced information stored in one of the plurality of additional memory areas during rendering.
 37. The machine readable medium of claim 36 wherein instructions to store selected state group information into the additional memory areas further comprises: instructions to store state variables associated with slow state functions into the plurality of additional memory areas.
 38. The machine readable medium of claim 37 wherein instructions to define a reference for the additional memory areas further comprises: instructions to define a pointer to the additional memory areas where slow state information is stored.